Bisimulation Metrics are Optimal Value Functions

نویسندگان

Norm Ferns

Doina Precup

چکیده

Bisimulation is a notion of behavioural equivalence on the states of a transition system. Its definition has been extended to Markov decision processes, where it can be used to aggregate states. A bisimulation metric is a quantitative analog of bisimulation that measures how similar states are from a the perspective of long-term behavior. Bisimulation metrics have been used to establish approximation bounds for state aggregation and other forms of value function approximation. In this paper, we prove that a bisimulation metric defined on the state space of a Markov decision process is the optimal value function of an optimal coupling of two copies of the original model. We prove the result in the general case of continuous state spaces. This result has important implications in understanding the complexity of computing such metrics, and opens up the possibility of more efficient computational methods.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Metrics for Markov Decision Processes with Infinite State Spaces

We present metrics for measuring state similarity in Markov decision processes (MDPs) with infinitely many states, including MDPs with continuous state spaces. Such metrics provide a stable quantitative analogue of the notion of bisimulation for MDPs, and are suitable for use in MDP approximation. We show that the optimal value function associated with a discounted infinite horizon planning tas...

متن کامل

Bisimulation Metrics for Continuous Markov Decision Processes

In recent years, various metrics have been developed for measuring the behavioural similarity of states in probabilistic transition systems [Desharnais et al., Proceedings of CONCUR, (1999), pp. 258-273, van Breugel and Worrell, Proceedings of ICALP, (2001), pp. 421-432]. In the context of finite Markov decision processes, we have built on these metrics to provide a robust quantitative analogue...

متن کامل

Knowledge Transfer in Markov Decision Processes

Markov Decision Processes (MDPs) are an effective way to formulate many problems in Machine Learning. However, learning the optimal policy for an MDP can be a time-consuming process, especially when nothing is known about the policy to begin with. An alternative approach is to find a similar MDP, for which an optimal policy is known, and modify this policy as needed. We present a framework for ...

متن کامل

An approximation theory for discrete event and continuous time systems

Established system relationships for discrete systems, such as language inclusion, simulation, and bisimulation, require system observations to be identical. When interacting with the physical world, modeled by continuous or hybrid systems, exact relationships are restrictive and not robust. In this paper, we develop the first framework of system approximation that applies to both discrete and ...

متن کامل

Optimal Stochastic Control in Continuous Time with Wiener Processes: General Results and Applications to Optimal Wildlife Management

We present a stochastic optimal control approach to wildlife management. The objective value is the present value of hunting and meat, reduced by the present value of the costs of plant damages and traffic accidents caused by the wildlife population. First, general optimal control functions and value functions are derived. Then, numerically specified optimal control functions and value func...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2014

Bisimulation Metrics are Optimal Value Functions

نویسندگان

چکیده

منابع مشابه

Metrics for Markov Decision Processes with Infinite State Spaces

Bisimulation Metrics for Continuous Markov Decision Processes

Knowledge Transfer in Markov Decision Processes

An approximation theory for discrete event and continuous time systems

Optimal Stochastic Control in Continuous Time with Wiener Processes: General Results and Applications to Optimal Wildlife Management

عنوان ژورنال:

اشتراک گذاری